1 Introduction



The problem of sparse recovery can be traced back to papers from the 1990s such as [HJ [TOl [9].
In 2006 the area of compressed sensing made great progress through two groundbreaking papers,
namely [5] by Candès, Romberg and Tao and [11] by Donoho. The compressed sensing problem
is: recover $x$ from knowledge of $y = \Phi x$, where $\Phi$ is a suitable $n \times N$ measurement
matrix and $n < N$. Compressed sensing introduces the extra assumption that the vector
$x = (x_i)_{i=1}^{N} \in \mathbb{R}^N$ is $k$-sparse, that is, the number of non-zero coefficients of $x$,
denoted by $\|x\|_0 := |\{i : x_i \neq 0\}|$, is at most $k$. More generally, we assume that $x$ is
well approximated by a sparse vector. This discovery has a number of potential applications in
signal processing, as well as other areas of science and technology.

It is now well known that this problem can be solved by $\ell_0$-minimization:

$$\min \|x\|_0 \quad \text{subject to} \quad y = \Phi x. \tag{1.1}$$

Given the difficulty of this combinatorial optimization problem, one instead solves
the convex problem:

$$\min \|x\|_1 \quad \text{subject to} \quad y = \Phi x, \tag{1.2}$$

where the $\ell_p$-norm is defined as usual by $\|x\|_p = \bigl(\sum_{j=1}^{N} |x_j|^p\bigr)^{1/p}$.
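Problem (1.2) can be recast as a linear program by writing $x = u - v$ with $u, v \ge 0$ and minimizing $\sum_j (u_j + v_j)$. The following is a minimal sketch of that standard reformulation, assuming NumPy and SciPy are available. The helper name `basis_pursuit`, the Gaussian test matrix and the problem sizes are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.optimize import linprog


def basis_pursuit(Phi, y):
    """Solve min ||x||_1 subject to Phi x = y as a linear program.

    Hypothetical helper: writes x = u - v with u, v >= 0 and minimizes sum(u) + sum(v).
    """
    n, N = Phi.shape
    cost = np.ones(2 * N)                  # objective: sum(u) + sum(v)
    A_eq = np.hstack([Phi, -Phi])          # encodes Phi (u - v) = y
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    u, v = res.x[:N], res.x[N:]
    return u - v


# Illustrative sizes (assumptions): N = 40 unknowns, n = 20 measurements, k = 3 nonzeros.
rng = np.random.default_rng(0)
N, n, k = 40, 20, 3
x_true = np.zeros(N)
support = rng.choice(N, size=k, replace=False)
x_true[support] = rng.choice([-1.0, 1.0], size=k)
Phi = rng.standard_normal((n, N)) / np.sqrt(n)
y = Phi @ x_true
x_hat = basis_pursuit(Phi, y)
```

Since `x_true` is itself feasible, the returned `x_hat` always satisfies the constraint and has $\ell_1$-norm no larger than that of `x_true`; whether the two coincide depends on the sparsity and on the matrix.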

The matrix $\Phi$ is said to have the Restricted Isometry Property (RIP) of order $k$ if there exists
a $\delta_k \in (0, 1)$ such that

$$(1-\delta_k)\|x\|_2^2 \le \|\Phi x\|_2^2 \le (1+\delta_k)\|x\|_2^2 \tag{1.3}$$

for all $k$-sparse vectors $x$. Here $\delta_k$ is the isometry constant of the matrix $\Phi$, i.e. the
smallest number satisfying (1.3). Due to [3 O HI HI HI la [I7| et al, the $\ell_0$ and $\ell_1$ problems
are in fact formally equivalent. More precisely, if $\delta_{2k} < \sqrt{2} - 1$, the $\ell_0$ problem has a
unique $k$-sparse solution and the solution to the $\ell_1$ problem coincides with that of the $\ell_0$
problem. In other words, the convex relaxation is exact. It has been shown that the solution $x^*$
of (1.2) recovers $x$ exactly provided that (1) $x$ is sufficiently sparse and (2) the measurement
matrix $\Phi$ satisfies the RIP.
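For very small matrices the isometry constant in (1.3) can be computed by exhaustive search, which may help in building intuition. The sketch below assumes NumPy. The function name `restricted_isometry_constant` is hypothetical, and the cost grows combinatorially in $N$ and $k$, so this is an illustration rather than a practical test.

```python
from itertools import combinations

import numpy as np


def restricted_isometry_constant(Phi, k):
    """Brute-force delta_k: exhaustive search over all supports of size k (small N only)."""
    N = Phi.shape[1]
    delta = 0.0
    for S in combinations(range(N), k):
        cols = Phi[:, list(S)]
        eig = np.linalg.eigvalsh(cols.T @ cols)   # ascending eigenvalues of the Gram matrix
        delta = max(delta, abs(eig[0] - 1.0), abs(eig[-1] - 1.0))
    return delta


delta_identity = restricted_isometry_constant(np.eye(4), 2)        # orthonormal columns
delta_scaled = restricted_isometry_constant(2.0 * np.eye(4), 2)    # columns of norm 2
```

A returned value of 1 or more means that no admissible $\delta_k \in (0,1)$ exists, i.e. the matrix does not satisfy the RIP of order $k$.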

How to choose a suitable measurement matrix $\Phi$ is a central question in
this field. Most known choices are random matrices such as Gaussian or Bernoulli random matrices
as well as partial Fourier matrices; see [Ul [TE[ [22] . It is known O [7] that random Gaussian or
Bernoulli matrices, i.e. $n \times N$ matrices with independent, normally distributed or Bernoulli
distributed entries, satisfy the RIP with probability at least $1 - \varepsilon$ provided
$k \le C_1 n \log(N/k) + C_2 \log \varepsilon^{-1}$,
where $C_1$ and $C_2$ are constants depending only on $\delta_k$. Although Gaussian random matrices are
optimal for sparse recovery, they have limited use in practice because many measurement
technologies impose structure on the matrix.

Recently the restricted isometry constants of random Toeplitz-type and circulant matrices were
estimated, where the entries of the vector used to generate the Toeplitz or circulant matrix are
chosen at random according to a suitable probability distribution; this makes it possible to provide
recovery guarantees for $\ell_1$-minimization; see [21 [T6| \T9\ [2H I23| . Compared to Bernoulli or
Gaussian matrices, random Toeplitz and circulant matrices have the advantage that they require
fewer random numbers to be generated. More importantly, recovery algorithms
tend to be more efficient when the matrix admits a fast matrix-vector multiply. Furthermore,
they arise naturally in certain applications such as identifying a linear time-invariant system.
These works close the theoretical gap by providing recovery guarantees for $\ell_1$-minimization in
connection with circulant and Toeplitz-type matrices where the necessary number of measurements
scales linearly with the sparsity. However, their bound is very pessimistic compared to related
estimates for Bernoulli, Gaussian or partial Fourier matrices. More precisely, the estimated number
of measurements grows with the sparsity squared, while one would rather expect a linear scaling.

Now we consider an $N \times N$ symmetric matrix whose entries $r_{ij}$ follow the Bernoulli
distribution, i.e. $r_{ij}$ takes the values $1$ and $-1$ each with probability $1/2$, and the $r_{ij}$
are independent for $i \le j$. It can also be derived from the adjacency matrix of a random graph
which contains an edge with probability $1/2$ between any two vertices (not necessarily distinct).
Choose an arbitrary subset $\Omega \subset \{1, 2, \dots, N\}$ of cardinality $n < N$, and let $R$ be the
partial random symmetric Bernoulli matrix of size $n \times N$, i.e. the submatrix obtained from the
above matrix by choosing the $n$ rows indexed by $\Omega$. Without loss of generality, we choose the
first $n$ rows. Compared with the matrices mentioned above, it is symmetric and has few dependent
entries in each column; namely, it requires fewer random numbers to be generated, and there are
fast matrix multiplication routines that can be exploited in recovery algorithms.
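The construction just described can be sketched as follows, assuming NumPy. The function name `partial_symmetric_bernoulli` is hypothetical, and for clarity the sketch draws a full $N \times N$ array and discards the part below the diagonal, which is simpler but uses more random numbers than strictly necessary.

```python
import numpy as np


def partial_symmetric_bernoulli(n, N, rng):
    """First n rows of an N x N symmetric matrix with +/-1 entries.

    Entries on and above the diagonal are drawn independently; the rest are mirrored.
    """
    full = rng.choice([-1, 1], size=(N, N))
    upper = np.triu(full)                    # keep entries with i <= j
    sym = upper + np.triu(full, k=1).T       # mirror the strictly upper part below the diagonal
    return sym[:n, :]


rng = np.random.default_rng(1)
R = partial_symmetric_bernoulli(4, 8, rng)
```

The leading $n \times n$ block of `R` is symmetric, which is the only dependence among its entries.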

2 Our contribution 

The main idea of this paper, as well as some of the techniques, is motivated by [1]. The key
difference from [1] is that we show that Lemma 2.1 below remains valid even though the entries of
the partial random symmetric matrix are not independent. Hence this matrix satisfies the RIP and
can be used as a measurement matrix.

Let $A$ be an $N \times m$ matrix, each column of which is an $N$-dimensional vector. Let $R$ be
an $n \times N$ partial random symmetric matrix. Consider the projection $f$:

$$f : A \mapsto n^{-1/2} R A =: E.$$

That is, the $i$th column of $A$ is mapped to the $i$th column of $E$, and $m$ $N$-dimensional vectors
are projected to $m$ $n$-dimensional vectors. Furthermore, we want the projection to leave distances
almost invariant, i.e.

$$(1-\varepsilon)\|u - v\|_2^2 \le \|f(u) - f(v)\|_2^2 \le (1+\varepsilon)\|u - v\|_2^2. \tag{2.1}$$

Let $a$ be a column vector of $A$. Then $f(a) = \frac{1}{\sqrt{n}} R a$. As $f$ is linear, we may
normalize $a$ so that it is a unit vector. For convenience in calculation, take $R = (r_{ij}/\sqrt{N})$.
Let $R = (r_1, r_2, \dots, r_n)^T$ be the row decomposition of $R$. Then
$f(a) = \sqrt{N/n}\,(r_1 \cdot a, \dots, r_n \cdot a) =: \sqrt{N/n}\,(Q_1, \dots, Q_n)$. One
can get:

$$\mathbb{E}(Q_i) = 0, \quad \mathbb{E}(Q_i^2) = \frac{1}{N}, \quad \mathbb{E}(\|f(a)\|_2^2) = \frac{N}{n} \sum_{i=1}^{n} \mathbb{E}(Q_i^2) = 1.$$



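Let $S = \sum_{i=1}^{n} Q_i^2$. Then $\|f(a)\|_2^2 = S \times \frac{N}{n}$, and

$$\Pr\bigl[(1-\varepsilon)\|a\|_2^2 \le \|f(a)\|_2^2 \le (1+\varepsilon)\|a\|_2^2\bigr] = \Pr\Bigl[(1-\varepsilon)\frac{n}{N} \le S \le (1+\varepsilon)\frac{n}{N}\Bigr].$$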
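The normalization can be checked empirically: for a unit vector $a$, the squared norm of $f(a) = n^{-1/2} R a$ should average to 1 over draws of $R$. The following is a rough Monte Carlo sketch, assuming NumPy. The helper `project`, the sizes and the number of trials are illustrative assumptions, and the estimate is only approximate.

```python
import numpy as np


def project(a, n, rng):
    """Return f(a) = n^{-1/2} R a for a freshly drawn partial symmetric +/-1 matrix R."""
    N = a.shape[0]
    full = rng.choice([-1.0, 1.0], size=(N, N))
    sym = np.triu(full) + np.triu(full, k=1).T    # symmetric, entries independent for i <= j
    return sym[:n, :] @ a / np.sqrt(n)


rng = np.random.default_rng(2)
N, n, eps, trials = 64, 32, 0.5, 400
a = rng.standard_normal(N)
a /= np.linalg.norm(a)                             # unit vector, as in the text
sq_norms = np.array([np.sum(project(a, n, rng) ** 2) for _ in range(trials)])
mean_sq_norm = sq_norms.mean()                     # empirical estimate of E ||f(a)||^2
inside = np.mean(np.abs(sq_norms - 1.0) <= eps)    # fraction of trials within (1 +/- eps)
```

Here `inside` estimates the probability on the left-hand side of the last display for this particular `a`.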

Lemma 2.1

$$\mathbb{E}\Bigl(\prod_{i=1}^{n} \exp(hQ_i^2)\Bigr) = \bigl(\mathbb{E}(\exp(hQ_1^2))\bigr)^{n}.$$

This lemma guarantees that the partial random matrix $R$ has a property similar to that of the
Bernoulli matrix discussed in [1], and it leads directly to the conclusions below.



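Theorem 2.2

$$\Pr\Bigl[S > (1+\varepsilon)\frac{n}{N}\Bigr] < \exp\Bigl(-\frac{n}{2}\Bigl(\frac{\varepsilon^2}{2} - \frac{\varepsilon^3}{3}\Bigr)\Bigr), \qquad \Pr\Bigl[S < (1-\varepsilon)\frac{n}{N}\Bigr] < \exp\Bigl(-\frac{n}{2}\Bigl(\frac{\varepsilon^2}{2} - \frac{\varepsilon^3}{3}\Bigr)\Bigr).$$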



Corollary 2.3 Given any $\varepsilon, \beta > 0$, if $n \ge \frac{4 + 2\beta}{\varepsilon^2/2 - \varepsilon^3/3} \log m$, then with probability at least $1 - m^{-\beta}$, (2.1)
holds for any two columns $u, v$ of $A$.
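To get a feel for the size of this bound, the threshold $n \ge \frac{4+2\beta}{\varepsilon^2/2 - \varepsilon^3/3}\log m$ can be evaluated directly. The sketch below uses only the Python standard library. The function name `min_measurements` is hypothetical, and the natural logarithm is assumed.

```python
import math


def min_measurements(eps, beta, m):
    """Smallest integer n with n >= (4 + 2*beta) / (eps**2/2 - eps**3/3) * log(m)."""
    if not (eps > 0 and beta > 0 and m > 1):
        raise ValueError("need eps > 0, beta > 0 and m > 1")
    denom = eps ** 2 / 2 - eps ** 3 / 3
    if denom <= 0:
        raise ValueError("eps too large: eps**2/2 - eps**3/3 must be positive")
    return math.ceil((4 + 2 * beta) / denom * math.log(m))


n_loose = min_measurements(0.5, 1.0, 1000)    # tolerance 0.5 on squared distances
n_tight = min_measurements(0.25, 1.0, 1000)   # tighter tolerance needs more rows
```

The dependence on the number of points $m$ is only logarithmic, while halving $\varepsilon$ increases the bound by more than a factor of three in this example.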

Theorem 2.4 For any given $0 < \delta < 1$, if we take $\Phi(\omega) = n^{-1/2} R$ and $n \ge c_1^{-1} k \log(N/k)$,
then RIP (1.3) holds for $\Phi(\omega)$ with the prescribed $\delta$ and order $k$ with probability $\ge 1 - 2e^{-c_2 n}$,
where $c_1, c_2$ depend only on $\delta$.

Lemma 2.5 [7] Assume that $\delta_{2k} < \sqrt{2} - 1$. Then the solution $x^*$ to (1.2) obeys

$$\|x^* - x\|_1 \le C_0 \|x - x_{(k)}\|_1 \quad \text{and} \quad \|x^* - x\|_2 \le C_0 k^{-1/2} \|x - x_{(k)}\|_1$$

for some constant $C_0$, where $x_{(k)}$ is obtained from $x$ by setting all but the $k$ largest entries to
zero. In particular, if $x$ is $k$-sparse, the recovery is exact.

If the measurements are corrupted with noise, that is,

$$y = \Phi x + z, \tag{2.2}$$

where $z$ is an unknown noise term, we consider the following problem:

$$\min \|x\|_1 \quad \text{subject to} \quad \|y - \Phi x\|_2 \le \varepsilon, \tag{2.3}$$

where $\varepsilon$ is an upper bound on the size of the noise contribution.

Lemma 2.6 [7] Assume that $\delta_{2k} < \sqrt{2} - 1$ and $\|z\|_2 \le \varepsilon$. Then the solution $x^*$ to (2.3) obeys

$$\|x^* - x\|_2 \le C_0 k^{-1/2} \|x - x_{(k)}\|_1 + C_1 \varepsilon,$$

for some constants $C_0, C_1$.

So, to recover a $k$-sparse vector $x$, in Theorem 2.4 we take $\delta$ such that $0 < \delta < \sqrt{2} - 1$
and $n \ge c_1^{-1}\, 2k \log(N/(2k))$, and use the matrix $n^{-1/2} R$ as $\Phi$; then $\Phi$ obeys the RIP of
order $2k$ with $\delta < \sqrt{2} - 1$. By Lemma 2.5, with high probability, we can recover $x$ exactly.



3 Proofs 

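Proof of Lemma 2.1. We first prove that, taking $B = \{r_{12} = \alpha_2, r_{13} = \alpha_3, \dots, r_{1n} = \alpha_n\}$,
the conditional expectation $\mathbb{E}\bigl(\exp\bigl(\sum_{j=2}^{n} hQ_j^2\bigr) \mid B\bigr)$ is independent of $B$:

$$\begin{aligned}
\mathbb{E}\Bigl(\exp\Bigl(\sum_{j=2}^{n} hQ_j^2\Bigr) \Bigm| B\Bigr)
&= \mathbb{E}\Bigl(\exp\Bigl(h\Bigl(\bigl(a_1\alpha_2 + \sum_{k=2}^{N} a_k r_{2k}\bigr)^2 + \cdots + \bigl(a_1\alpha_n + \sum_{k=2}^{N} a_k r_{nk}\bigr)^2\Bigr)\Bigr) \Bigm| B\Bigr) \\
&= \mathbb{E}\Bigl(\exp\Bigl(h\Bigl(\bigl(a_1|\alpha_2| + \sum_{k=2}^{N} a_k \operatorname{sgn}(\alpha_2) r_{2k}\bigr)^2 + \cdots + \bigl(a_1|\alpha_n| + \sum_{k=2}^{N} a_k \operatorname{sgn}(\alpha_n) r_{nk}\bigr)^2\Bigr)\Bigr) \Bigm| B\Bigr) \\
&= \mathbb{E}\Bigl(\exp\Bigl(h\Bigl(\bigl(a_1 + \sum_{k=2}^{N} a_k r_{2k}\bigr)^2 + \cdots + \bigl(a_1 + \sum_{k=2}^{N} a_k r_{nk}\bigr)^2\Bigr)\Bigr) \Bigm| B\Bigr).
\end{aligned}$$

Observe

$$\begin{aligned}
\mathbb{E}\Bigl(\exp\Bigl(\sum_{j=2}^{n} hQ_j^2\Bigr)\Bigr)
&= \sum_{\alpha_2, \alpha_3, \dots, \alpha_n} \mathbb{E}\Bigl(\exp\Bigl(\sum_{j=2}^{n} hQ_j^2\Bigr) \Bigm| B\Bigr) \Pr(B) \\
&= \mathbb{E}\Bigl(\exp\Bigl(\sum_{j=2}^{n} hQ_j^2\Bigr) \Bigm| B\Bigr) \sum_{\alpha_2, \alpha_3, \dots, \alpha_n} \Pr(B) \\
&= \mathbb{E}\Bigl(\exp\Bigl(\sum_{j=2}^{n} hQ_j^2\Bigr) \Bigm| B\Bigr).
\end{aligned}$$

Now we have

$$\begin{aligned}
\mathbb{E}\Bigl(\prod_{i=1}^{n} \exp(hQ_i^2)\Bigr)
&= \sum_{\alpha_2, \alpha_3, \dots, \alpha_n} \mathbb{E}\Bigl(\exp(hQ_1^2) \cdot \exp\Bigl(\sum_{j=2}^{n} hQ_j^2\Bigr) \Bigm| B\Bigr) \Pr(B) \\
&= \sum_{\alpha_2, \alpha_3, \dots, \alpha_n} \mathbb{E}\bigl(\exp(hQ_1^2) \mid B\bigr)\, \mathbb{E}\Bigl(\exp\Bigl(\sum_{j=2}^{n} hQ_j^2\Bigr) \Bigm| B\Bigr) \Pr(B) \\
&= \mathbb{E}\Bigl(\exp\Bigl(\sum_{j=2}^{n} hQ_j^2\Bigr) \Bigm| B\Bigr) \sum_{\alpha_2, \alpha_3, \dots, \alpha_n} \mathbb{E}\bigl(\exp(hQ_1^2) \mid B\bigr) \Pr(B) \\
&= \mathbb{E}\Bigl(\exp\Bigl(\sum_{j=2}^{n} hQ_j^2\Bigr) \Bigm| B\Bigr)\, \mathbb{E}\bigl(\exp(hQ_1^2)\bigr) \\
&= \mathbb{E}\bigl(\exp(hQ_1^2)\bigr)\, \mathbb{E}\Bigl(\exp\Bigl(\sum_{j=2}^{n} hQ_j^2\Bigr)\Bigr).
\end{aligned}$$

The result holds by induction.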

Lemma 3.1 [1] For all $h \in [0, N/2)$ and all $N \ge 1$,

$$\mathbb{E}(\exp(hQ_1^2)) \le \frac{1}{\sqrt{1 - 2h/N}}, \tag{3.1}$$

$$\mathbb{E}(Q_1^4) \le \frac{3}{N^2}. \tag{3.2}$$



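Proof of Theorem 2.2. The proof is very similar to that of [1, Lemma 5], combined with
Lemmas 2.1 and 3.1. For arbitrary $h > 0$,

$$\Pr\Bigl[S > (1+\varepsilon)\frac{n}{N}\Bigr] = \Pr\Bigl[\exp(hS) > \exp\Bigl(h(1+\varepsilon)\frac{n}{N}\Bigr)\Bigr] \le \mathbb{E}(\exp(hS)) \exp\Bigl(-h(1+\varepsilon)\frac{n}{N}\Bigr).$$

By Lemma 2.1, we get

$$\mathbb{E}(\exp(hS)) = \bigl(\mathbb{E}(\exp(hQ_1^2))\bigr)^{n}.$$

Thus for any $\varepsilon > 0$,

$$\Pr\Bigl[S > (1+\varepsilon)\frac{n}{N}\Bigr] \le \bigl(\mathbb{E}(\exp(hQ_1^2))\bigr)^{n} \exp\Bigl(-h(1+\varepsilon)\frac{n}{N}\Bigr). \tag{3.3}$$

Similarly, but this time considering $\exp(-hS)$ for arbitrary $h > 0$, we get that for any $\varepsilon > 0$,

$$\Pr\Bigl[S < (1-\varepsilon)\frac{n}{N}\Bigr] \le \bigl(\mathbb{E}(\exp(-hQ_1^2))\bigr)^{n} \exp\Bigl(h(1-\varepsilon)\frac{n}{N}\Bigr). \tag{3.4}$$

Substituting (3.1) in (3.3) we get (3.5). To optimize the bound we set the derivative in (3.5)
with respect to $h$ to 0. This gives $h = \frac{N}{2} \frac{\varepsilon}{1+\varepsilon} < \frac{N}{2}$. Substituting this value of $h$ and series
expansion yields (3.6).

$$\Pr\Bigl[S > (1+\varepsilon)\frac{n}{N}\Bigr] \le \Bigl(\frac{1}{\sqrt{1 - 2h/N}}\Bigr)^{n} \exp\Bigl(-h(1+\varepsilon)\frac{n}{N}\Bigr) \tag{3.5}$$

$$= \bigl((1+\varepsilon)\exp(-\varepsilon)\bigr)^{n/2} < \exp\Bigl(-\frac{n}{2}\Bigl(\frac{\varepsilon^2}{2} - \frac{\varepsilon^3}{3}\Bigr)\Bigr). \tag{3.6}$$

Similarly, substituting (3.2) in (3.4) and taking $h = \frac{N}{2} \frac{\varepsilon}{1+\varepsilon}$, we get

$$\Pr\Bigl[S < (1-\varepsilon)\frac{n}{N}\Bigr] < \exp\Bigl(-\frac{n}{2}\Bigl(\frac{\varepsilon^2}{2} - \frac{\varepsilon^3}{3}\Bigr)\Bigr). \tag{3.7}$$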
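The substitution and "series expansion" step leading from (3.5) to (3.6) can be spelled out as follows. This is a sketch of the standard argument rather than a quotation from the paper, and the helper function $g$ is introduced only for this illustration.

```latex
% With h = (N/2) * eps/(1+eps) we have 1 - 2h/N = 1/(1+eps), so
\left(\frac{1}{\sqrt{1-2h/N}}\right)^{n} \exp\!\left(-h(1+\varepsilon)\frac{n}{N}\right)
  = (1+\varepsilon)^{n/2}\, e^{-\varepsilon n/2}
  = \bigl((1+\varepsilon)e^{-\varepsilon}\bigr)^{n/2}.
% Let g(eps) = eps - eps^2/2 + eps^3/3 - log(1+eps). Then g(0) = 0 and
g'(\varepsilon) = 1 - \varepsilon + \varepsilon^{2} - \frac{1}{1+\varepsilon}
  = \frac{\varepsilon^{3}}{1+\varepsilon} > 0 \quad (\varepsilon > 0),
% hence log(1+eps) - eps < -(eps^2/2 - eps^3/3) for eps > 0, and therefore
\bigl((1+\varepsilon)e^{-\varepsilon}\bigr)^{n/2}
  < \exp\!\left(-\frac{n}{2}\Bigl(\frac{\varepsilon^{2}}{2}-\frac{\varepsilon^{3}}{3}\Bigr)\right).
```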



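Proof of Corollary 2.3. For any column $a$ of $A$, by Theorem 2.2,

$$\Pr\bigl[\|f(a)\|_2^2 < (1-\varepsilon)\|a\|_2^2 \ \text{or}\ \|f(a)\|_2^2 > (1+\varepsilon)\|a\|_2^2\bigr] < 2\exp\Bigl(-\frac{n}{2}\Bigl(\frac{\varepsilon^2}{2} - \frac{\varepsilon^3}{3}\Bigr)\Bigr).$$

There are $\binom{m}{2}$ pairs $u, v$ of columns of $A$. So, taking $a = u - v$ in the above inequality
and applying the union bound over all pairs, we have

$$\Pr\bigl[\|f(u-v)\|_2^2 < (1-\varepsilon)\|u-v\|_2^2 \ \text{or}\ \|f(u-v)\|_2^2 > (1+\varepsilon)\|u-v\|_2^2 \ \text{for some}\ u, v\bigr] < 2\binom{m}{2} \exp\Bigl(-\frac{n}{2}\Bigl(\frac{\varepsilon^2}{2} - \frac{\varepsilon^3}{3}\Bigr)\Bigr).$$

Hence, if $n \ge \frac{4 + 2\beta}{\varepsilon^2/2 - \varepsilon^3/3} \log m$, then

$$\Pr\bigl[(1-\varepsilon)\|u-v\|_2^2 \le \|f(u-v)\|_2^2 \le (1+\varepsilon)\|u-v\|_2^2 \ \text{for all}\ u, v\bigr] \ge 1 - m^{-\beta}.$$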



Let $(\Omega, \rho)$ be a probability measure space and let $r$ be a random variable on $\Omega$. Given $n$ and
$N$, we can generate a random matrix $\Phi$ by choosing the entries $r_{ij}$ ($i = 1, \dots, n$; $j = 1, \dots, N$) as
(not necessarily independent) realizations of $r$. This yields the random matrix $\Phi(\omega)$.
If the probability distribution generating the matrix satisfies the following concentration
inequality:

$$\Pr\bigl[\,\bigl|\|\Phi(\omega)x\|_2^2 - \|x\|_2^2\bigr| \ge \varepsilon\|x\|_2^2\,\bigr] \le 2e^{-n c_0(\varepsilon)}, \quad 0 < \varepsilon < 1, \tag{3.8}$$

where the probability is taken over all $n \times N$ matrices $\Phi(\omega)$ and $c_0(\varepsilon)$ depends only on $\varepsilon$
with $c_0(\varepsilon) > 0$ for all $\varepsilon$, then the RIP holds for $\Phi(\omega)$ with high probability; see the following result.

Lemma 3.2 [3] Suppose that $n$, $N$, and $0 < \delta < 1$ are given. If $\Phi(\omega)$ satisfies (3.8), then there
exist constants $c_1, c_2 > 0$ depending only on $\delta$ such that RIP (1.3) holds for $\Phi(\omega)$ with the
prescribed $\delta$ and any $k \le c_1 n / \log(N/k)$ with probability $\ge 1 - 2e^{-c_2 n}$.

Remark: 1. Lemma 3.2 remains valid if we take $k \le c_1' n / [\log(N/n) + 1]$ for some $c_1' > 0$
depending only on $c_1$.

2. If we need the RIP (1.3) to hold with order $k$, we take $n \ge c_1^{-1} k \log(N/k)$. Hence Theorem
2.4 follows.

4 Experiments 

Let $x$ be a $k$-sparse discrete signal of length 256 whose nonzero entries are 1 or −1. The sensing
matrix $R$ is a partial random symmetric Bernoulli matrix. The classical convex optimization
algorithm, $\ell_1$-minimization, is used for reconstruction. The experimental results are compared
with those of Bernoulli, random Gaussian, Toeplitz and circulant matrices, where the entries of
the Gaussian matrix are drawn from a normal distribution with mean zero and variance one, the
Toeplitz matrix is generated by the first two rows of the Gaussian matrix, and the circulant
matrix is generated by the first row.
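One possible reading of how the comparison matrices are built is sketched below, assuming NumPy. The text does not say exactly how the "first two rows" define the Toeplitz matrix, so the choice here (first row of the Gaussian matrix as the Toeplitz first row, leading entries of the second row as its first column) is an assumption, as are the helper names `partial_circulant` and `partial_toeplitz` and the small sizes.

```python
import numpy as np


def partial_circulant(first_row, n):
    """n x N matrix whose row i is first_row cyclically shifted right by i."""
    N = first_row.shape[0]
    i = np.arange(n)[:, None]
    j = np.arange(N)[None, :]
    return first_row[(j - i) % N]


def partial_toeplitz(first_row, first_col):
    """n x N Toeplitz matrix T[i, j] = t[j - i]; first_col[0] is ignored in favour of first_row[0]."""
    n, N = first_col.shape[0], first_row.shape[0]
    diagonals = np.concatenate([first_col[:0:-1], first_row])   # t[-(n-1)], ..., t[N-1]
    i = np.arange(n)[:, None]
    j = np.arange(N)[None, :]
    return diagonals[j - i + n - 1]


rng = np.random.default_rng(3)
n, N = 5, 8
G = rng.standard_normal((n, N))           # Gaussian matrix, mean 0 and variance 1
C = partial_circulant(G[0], n)            # generated by the first row
T = partial_toeplitz(G[0], G[1, :n])      # assumption: second row supplies the first column
```

Both structured matrices need far fewer random numbers than the $nN$ entries of `G`: $N$ for the circulant matrix and $N + n - 1$ for the Toeplitz matrix.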

We first analyze the performance of the matrices under different sparsity levels. Set the number
of measurements $n = 100$. The results of 1000 experiments are summarized in Fig. 4.1, from which
we see that as the sparsity increases, the performance of all matrices decreases. It is hard to
distinguish which is the best among the Bernoulli matrix (B), Gaussian matrix (G), Toeplitz
matrix (T), circulant matrix (C) and $R$.

We also investigate the performance of the matrices under different numbers of measurements.
Set the sparsity $k = 20$. The results of 1000 experiments are summarized in Fig. 4.2.
As the number of measurements $n$ becomes large, the performance of all matrices improves.
In particular, when $n > 95$ almost all experiments are successful.

Next we check the performance of the above sensing matrices in a real image reconstruction
experiment. The original image is shown in Fig. 4.3; it has size $64 \times 64$ and
sparsity $k = 739$. Set the number of measurements $n = 2400$. The mean square error (MSE) is defined
as MSE = " ), where $\|\cdot\|_F$ denotes the Frobenius norm, $X$ is the reconstruction and $M$ is
the original image. The experimental results are shown in Fig. 4.3.

In practice, the sampled signal is usually corrupted by unavoidable noise. It is therefore
necessary to check the performance of our sensing matrix $R$ under different noise levels. Gaussian
random noise with mean 0 and standard deviation chosen from {0, 0.2, 0.4,
0.6, 0.8, 1.0} is added to the measurements of the image. Experimental results are shown
in Fig. 4.4. Increasing the noise level leads to poorer reconstruction performance.

Figure 4.1: Success rate as a function of sparsity K

Figure 4.2: Success rate as a function of measurement number m

5 Conclusion 

As we know, the equality $\mathbb{E}(XY) = \mathbb{E}X \cdot \mathbb{E}Y$ may hold even if $X, Y$ are not independent. To a
certain extent the partial random symmetric Bernoulli matrix has properties similar to those
of Gaussian or Bernoulli matrices. The theoretical analysis and experimental results show that
we can use this partial random Bernoulli matrix as the measurement matrix in Compressed
Sensing.

Figure 4.3: Real world data reconstruction. Panels: Gaussian (MSE=0.0681), Toeplitz (MSE=0.0647), Circulant (MSE=0.0684)

Figure 4.4: Signal Noise Ratio (SNR) under different noise levels

Furthermore, there is a relationship between this matrix and random graphs. Recall that
the Erdős-Rényi model $\mathcal{G}_n(p)$ consists of all graphs on $n$ vertices in which the edges are chosen
independently with probability $p \in (0,1)$ (see [1]). If $A(G)$ is the adjacency matrix
of a graph $G \in \mathcal{G}_n(1/2)$, then $2A(G) - J$ is a random symmetric matrix whose entries follow the
Bernoulli distribution, where $J$ is the matrix consisting of all ones. So there is hope of solving some
CS problems based on random graphs. We will consider this seriously in future work.
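The map from a random graph to a measurement matrix can be sketched as follows, assuming NumPy. The helper name `random_graph_adjacency` is hypothetical, and loops are allowed here (so that diagonal entries are random as well), following the description of the matrix in the Introduction rather than the usual loop-free Erdős-Rényi convention.

```python
import numpy as np


def random_graph_adjacency(N, rng, p=0.5):
    """Symmetric 0/1 adjacency matrix; each pair i <= j (loops included) is an edge with probability p."""
    coins = (rng.random((N, N)) < p).astype(int)
    upper = np.triu(coins)                   # decide each pair i <= j once
    return upper + np.triu(coins, k=1).T     # mirror to make the matrix symmetric


rng = np.random.default_rng(4)
N, n = 8, 4
A = random_graph_adjacency(N, rng)
J = np.ones((N, N), dtype=int)
S = 2 * A - J             # symmetric matrix with entries in {-1, +1}
R = S[:n, :]              # partial matrix: first n rows
```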